Prediction of User Preference over Shared-Control Paradigms

for  a  Robotic  Wheelchair 

Ahmetcan Erdogan1 and Brenna D. Argall2


Abstract —  The  design  of  intelligent  powered  wheelchairs  has 
traditionally  focused  heavily  on  providing  effective  and  efficient 
navigation  assistance.  Significantly  less  attention  has  been 
given  to  the  end-user’s  preference  between  different  assistance 
paradigms.  It  is  possible  to  include  these  subjective  evaluations 
in  the  design  process,  for  example  by  soliciting  feedback  in 
post-experiment  questionnaires.  However,  constantly  querying 
the  user  for  feedback  during  real-world  operation  is  not 
practical.  In  this  paper,  we  present  a  model  that  correlates 
objective  performance  metrics  and  subjective  evaluations  of 
autonomous  wheelchair  control  paradigms.  Using  off-the-shelf 
machine  learning  techniques,  we  show  that  it  is  possible  to 
build  a  model  that  can  predict  the  most  preferred  shared- 
control  method  from  task  execution  metrics  such  as  effort, 
safety,  performance  and  utilization.  We  further  characterize  the 
relative  contributions  of  each  of  these  metrics  to  the  individual 
choice  of  most  preferred  assistance  paradigm.  Our  evaluation 
includes  Spinal  Cord  Injured  (SCI)  and  uninjured  subject 
groups.  The  results  show  that  our  proposed  correlation  model 
enables  the  continuous  tracking  of  user  preference  and  offers 
the  possibility  of  autonomy  that  is  customized  to  each  user. 

I.  Introduction 

Robotics autonomy has the potential to assist an estimated
1.4 to 2.1 million powered wheelchair users in the United
States [1]. Recent years in particular have been marked by
the development of different control paradigms that share
the control with the user in various ways. Deciding exactly how
much  and  how  often  assistance  should  be  provided  is  critical 
for  end-user  acceptance,  and  will  be  a  key  factor  in  the  large- 
scale  adoption  of  these  systems. 

Autonomy in wheelchairs can be tuned according to traditional
robotics metrics that prioritize efficiency and success
in task performance. However, an additional consideration is
the fact that most end-users prefer to retain as much control
as is possible [2]. Moreover, each user is unique in their
needs, personal preferences and desired level of assistance.
Maintaining  a  balance  between  performance  and  end-user 
preference,  therefore,  is  essential  in  determining  the  proper 
sharing  of  control  between  the  user  and  autonomy. 

Within the smart (i.e., robotic) wheelchair literature, subjective
evaluations are most commonly utilized in order
to gather opinions about no autonomy (only human input)
versus full autonomy (no human input), or a single shared-
control paradigm (a combination of human and autonomy
input). There exist a limited number of studies that compare
multiple  control  paradigms  on  hardware,  and  none  of  these 
studies  investigate  potential  connections  between  objective 
execution  metrics  and  subjective  user  preference  metrics. 

In order to bridge the gap between performance and preference,
our work aims to experimentally model the correlations
between subjective and objective metrics across multiple
shared-control paradigms. We have previously compared
multiple shared-control paradigms and control interfaces in a
systematic experiment that spans multiple sessions [3]. The
high variability observed in the most effective and accepted
paradigms between subjects and over control interfaces supports
the idea of offering end-users multiple control options
to accommodate their individual needs and preferences.

However, it may not be practical to continuously query the
user for their preference. Therefore, we extend our previous
work by modeling the correlation between task-related metrics
and post-experiment subjective evaluations. Our model
successfully  captures  the  experimentally-observed  changes 
between  subject  groups  and  sessions,  while  providing  unique 
insight  into  the  relative  contribution  of  task  metrics  such  as 
effort,  safety,  performance  and  utilization.  Our  overarching 
aim  is  to  continuously  estimate  user  preference  over  multiple 
assistance  paradigms  which  can  then  be  used  to  modify 
and/or  switch  the  way  in  which  control  is  shared. 

The  rest  of  the  paper  is  organized  as  follows.  Section  II 
provides  a  review  of  related  literature  on  smart  wheelchair 
research. Section III details the control and hardware architectures
of our NURIC Smart Wheelchair, including the
four  tested  shared-control  paradigms.  The  correlation  model 
and  experimental  results  are  provided  in  Sections  IV  and  V. 
Section  VI  concludes  the  paper. 

II.  Background 

Exploratory qualitative studies that investigate the perspectives
of caregivers, therapists and powered wheelchair users
are common in the smart wheelchair domain. In these studies,
open-ended questions are discussed to characterize the needs
and desires of these populations, such as decreased risk
of collisions and increased social behavior [4]. To address
these desires, different shared-control approaches have been
proposed that span a wide range of options between the two
extremes of full autonomy and full teleoperation.

Studies  that  investigate  the  subjective  evaluation  of  these 
systems  highlight  some  key  requirements  from  the  users. 
End-users are shown to be most frustrated with full autonomy
because they feel that they have the least control [5],
and they demonstrate a high desire to be in control when they
are asked to evaluate different controllers in simulation [6].

There  are  also  shared-control  methods  that  rely  on  user 
input  for  the  entire  trajectory  in  order  to  accommodate 
the  desire  to  be  in  control.  This  approach  necessitates  a 
way  to  reason  between  two  (possibly  conflicting)  signals 
generated  by  the  user  and  autonomy.  One  possibility  is  to 
continually  blend  these  two  signals  in  a  weighted  sum,  where 
the  blending  ratio  may  be  constant  or  change  with  respect  to 
metrics  like  comfort,  transparency  or  safety  [7].  A  different 
approach  is  to  partition  the  control  space  between  the  human 
and  the  autonomy.  Here,  the  aim  often  is  to  find  a  safe 
heading that is as close to the user input as possible [8].

Some  studies  within  the  literature  augment  temporal  and 
kinematic  quantitative  task  execution  metrics  with  subjective 
evaluations  through  questionnaires  that  solicit  preference 
from  the  users  with  or  without  autonomy;  for  example,  the 
NASA  TLX  questionnaire  is  used  to  observe  workload  from 
the  user’s  perspective  [9].  These  approaches,  however,  may 
not  generalize  depending  on  the  impairment  of  the  subject, 
and  continuously  querying  the  user’s  satisfaction  with  the 
autonomy  may  undermine  its  acceptance. 

One possible alternative is to observe continuous indications
of the user’s condition via additional physiological
measurements (e.g., skin conductivity gauges or heart rate
monitors). These sensors provide a continuous metric that
can be incorporated into the control paradigm; for example,
by relating pulse oximeter readings to user anxiety [10].
However,  such  sensors  introduce  additional  complexity  and 
the  results  are  susceptible  to  external  disturbances. 

The  approach  we  present  in  this  paper  instead  builds  a 
correlation  model  between  subjective  evaluations  (of  user 
opinion  and  preference)  and  objective  measures  (effort, 
safety, performance and utilization) available from the sensors
already onboard the robot platform (to enable the autonomy
capabilities). With such a model, it becomes feasible
to  automatically  adapt  which  shared-control  paradigm  is 
employed  by  the  autonomy,  which  might  help  to  facilitate  the 
adoption  of  autonomy  capabilities  in  powered  wheelchairs. 

III.  Experimental  Evaluation 

In  this  work,  we  develop  an  experimentally-derived  model 
to investigate the relationship between objective and subjective
metrics of a shared-control task with an assistive robot.
The  model  aims  to  estimate,  from  task  execution  metrics,  a 
user’s  preference  over  multiple  shared-control  paradigms. 

Utilizing our modular software structure, we have previously
performed a multi-session experiment that evaluated
user performance, effort and preference over assistive
paradigms and control interfaces [3]. This section summarizes
the experimental platform and protocol for that study.

A.  Experimental  Platform 

We first present a brief summary of the control and hardware
architecture of our NURIC Smart Wheelchair [3].
The  software  and  hardware  components  are  modular  and 
customizable — allowing  for  various  control  formulations  and 
sensors  to  be  swapped  in  or  out. 


Fig. 1: NURIC Smart Wheelchair hardware configuration.
The base system consists of an RGB-D sensor, a mini-PC,
converter boards and wheel encoders. Additional hardware
can be added based on a user’s needs and preferences.


1) Control Architecture: The control framework consists
of a modular software system which can be broadly characterized
as high-level and low-level behaviors, and a set of
reasoning modules for command and goal arbitration.

In particular, each high-level behavior $f_{hi}(\cdot) \in \mathcal{F}_{hi}$
outputs a goal $g$

$g \leftarrow f_{hi}(x)$ (1)

based on the current state $x$ of the robot and the observable
environment. These individual goals are generated
from perception algorithms that identify locations of interest
and inference algorithms that estimate intent from user
commands. Examples of implemented high-level behaviors
include traversing doorways, docking at tables and desks and
driving up ramps, all commonly challenging scenarios for
end-users. Once the goal set $\mathcal{G}$ is populated with goals
$g \in \mathcal{G}$ (the set also might be empty), the goal arbitration
module computes a confidence $c_g$ for each element based on
conflicts, feasibility, perception confidence and agreement with
human-generated commands. From this, the highest-confidence
goal $g^* \in \mathcal{G}$ that is above a predefined threshold is selected.
Low-level behaviors $f_{lo}(\cdot) \in \mathcal{F}_{lo}$ then output a control
command $u_r$

$u_r \leftarrow f_{lo}(x, g^*)$ (2)

based on the most confident goal $g^*$ and state $x$. The
command arbitration module reasons between the autonomy-generated
command $u_r$ and the human-generated command
$u_h$. Specifically, the arbitration function $\beta(\cdot)$

$u \leftarrow \beta(u_h, u_r)$ (3)

generates a command $u$, which consists of translational $v$
and rotational $\omega$ speed components, that drives the
non-holonomic robot system.
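
As a rough illustration of the flow in Eqs. (1)-(3), the following Python sketch selects the most confident goal above a threshold, asks a low-level behavior for an autonomy command, and passes both commands to an arbitration function. The names `Goal`, `select_goal` and `control_step` are hypothetical, and the pass-through of the human command when no goal qualifies is an assumption, not a behavior stated in the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Command = Tuple[float, float]  # (translational speed v, rotational speed omega)


@dataclass
class Goal:
    name: str          # hypothetical label, e.g. "doorway"
    confidence: float  # c_g, assigned by the goal arbitration module


def select_goal(goals: List[Goal], threshold: float) -> Optional[Goal]:
    """Return the highest-confidence goal above the threshold, or None."""
    candidates = [g for g in goals if g.confidence > threshold]
    return max(candidates, key=lambda g: g.confidence) if candidates else None


def control_step(
    goals: List[Goal],
    threshold: float,
    low_level: Callable[[object, Goal], Command],
    arbitrate: Callable[[Command, Command], Command],
    state: object,
    u_h: Command,
) -> Command:
    """One pass through goal selection, low-level control and arbitration."""
    g_star = select_goal(goals, threshold)
    if g_star is None:
        # Assumption: with no confident goal, the human command passes through.
        return u_h
    u_r = low_level(state, g_star)  # autonomy command, Eq. (2)
    return arbitrate(u_h, u_r)      # executed command, Eq. (3)
```

Keeping the low-level behavior and the arbitration function as plain callables mirrors the modularity described above: a different paradigm can be tried by swapping one argument.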

2)  Hardware:  The  mechanical  design  of  the  NURIC 
Smart  Wheelchair  intentionally  utilizes  commercial 
products  to  facilitate  practical  adoption  by  users.  Specifically, 
our  system  is  built  on  a  commercially-available  powered 
wheelchair, a Permobil C300 (Timrå, Sweden), which we
then  outfit  with  additional  components  including  a  computer, 
electronics  and  sensors  (Fig.  1). 


[Fig. 2 panels: Filtered (immediate goal); Blended (immediate goal); Blended (perception goal); Switch (perception goal). Legend: human, autonomy and shared-control (executed) commands.]


Fig.  2:  The  four  shared-control  paradigms  of  our  study  (P1-P4),  which  differ  according  to  how  the  autonomy  commands  are 
generated  (immediate  or  perception  goals)  and  how  the  control  is  shared  between  the  user  and  autonomy  (blending,  filtering 
or switching). In particular, navigation goals (blue circle) for the autonomy are inferred simply from a brief (0.5 sec) forward
projection of the human’s current control command (immediate goal, P1 and P2) or from a higher-level perception goal
(doorway) detected from sensor data (perception goal, P3 and P4). In either case, the same planner is used to generate a path
(dashed blue line) to the goal, and the same controller is used to drive that path. The human command might be linearly
blended [7] with the autonomy command (Blended, P2 and P3) or be capped [11] to not exceed the autonomy command in
safety-critical (e.g., collision with an obstacle) situations (Filtered, P1), or the autonomy might take over 100% control [12]
when  relinquished  by  the  human  (Switch,  P4).  Doorway  shown  as  a  gap  in  the  top  gray  line,  robot  footprint  as  a  green 
outline  and  obstacle  as  a  gray  box. 


To  interface  with  the  proprietary  wheelchair  control  loop, 
we  use  the  expandable  input  OMNI  interface  from  R-Net 
Electronics (Christchurch, UK), which was originally designed to
enable the use of third-party control interfaces. Our
computer-generated control commands u can drive the wheelchair
by mimicking a regular inductive joystick via a voltage
regulator. Additional hardware add-ons include an onboard
computer (mini-PC) that is powered by the wheelchair batteries,
electronics boards, override buttons, wheel encoders
and  a  top-mounted  RGB-D  sensor  (Asus  Xtion). 

B.  Shared-Control  Paradigms 

Our  prior  study  evaluated  four  shared-control  paradigms 
(Fig.  2),  sampled  from  the  literature  on  wheelchair  navigation 
assistance.  The  paradigms  differ  in  the  way  that  (i)  the 
control  authority  is  allocated  and  (ii)  autonomy  goals  are 
generated. 

The overall purpose of each control paradigm is to compute
a control input $u = [v, \omega]$ that takes into consideration
the user signal $u_h = [v_h, \omega_h]$, autonomy signal $u_r = [v_r, \omega_r]$
and environment information.
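
The three ways of sharing control named in Fig. 2 (blending, filtering, switching) can be sketched as small functions over (v, omega) pairs. This is a minimal sketch under stated assumptions: the blending weight `alpha`, the per-component magnitude cap in `filter_cap`, and the boolean flags `safety_critical` and `relinquished` are illustrative choices, not details taken from the paper or from [7], [11], [12].

```python
import math
from typing import Tuple

Command = Tuple[float, float]  # (translational speed v, rotational speed omega)


def blend(u_h: Command, u_r: Command, alpha: float = 0.5) -> Command:
    """Linear blend: alpha weights the human, (1 - alpha) the autonomy."""
    return tuple(alpha * h + (1.0 - alpha) * r for h, r in zip(u_h, u_r))


def filter_cap(u_h: Command, u_r: Command, safety_critical: bool) -> Command:
    """Cap the human command by the autonomy command when safety-critical.

    Assumption: each component keeps the human's sign but its magnitude
    may not exceed the magnitude of the autonomy component.
    """
    if not safety_critical:
        return u_h
    return tuple(
        math.copysign(min(abs(h), abs(r)), h) for h, r in zip(u_h, u_r)
    )


def switch(u_h: Command, u_r: Command, relinquished: bool) -> Command:
    """Autonomy takes full control only when the human relinquishes it."""
    return u_r if relinquished else u_h
```

For example, `blend((1.0, 0.0), (0.0, 1.0), alpha=0.25)` weights the autonomy command three times as heavily as the human command.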

C.  Experimental  Protocol  and  Motivating  Results 

This  section  summarizes  the  experimental  protocol  of  our 
previous  study  that  investigated  user  performance,  preference 
and  effort  under  different  shared-control  paradigms  and 
control  interfaces.  Readers  are  referred  to  [3]  for  a  detailed 
analysis  of  this  experiment. 

Participants  were  asked  to  follow  a  predefined  path  in 
our  laboratory  that  traversed  four  doorways  (Fig.  3).  Due 
to  the  challenging  nature  of  the  task  and  the  prerequisite  of 
dexterous  control,  doorway  navigation  has  been  evaluated  in 
multiple  smart  wheelchair  systems  [13]  and  it  is  one  of  the 
tasks in the Powered Wheelchair Skills Test [14], which assesses
the ability of end-users to safely pilot a powered wheelchair.


Experiment  participants  included  7  SCI  subjects  (36-68 
years  old)  and  7  uninjured  subjects  (23-37  years  old).  On 
average, it had been 23.6±11.0 years since injury and the
SCI subjects had used a powered wheelchair for 21.0±11.4
years. The uninjured subjects had varying experience with
robotic  systems  but  were  mostly  naive  to  wheelchair  driving. 

The overall experiment was divided into four phases, each
of which started two meters away from a door ($t = t_0$) and
ended when the user safely traversed the doorway ($t = t_f$).
All  experimental  data  was  collected  via  the  ROS  pipeline  and 
the  majority  was  sampled  at  25  Hz  (with  the  exception  of 
computationally  expensive  topics  such  as  the  2-D  costmap, 
which  was  sampled  at  7  Hz).  MATLAB  was  used  to  segment 
the  doorway  traversal  time  intervals  and  for  data  processing. 

In  a  session,  each  navigation  assistance  paradigm  was 
evaluated  twice  and  presented  in  a  predefined  randomized 
order. Subjects were asked to perform a second session
at  least  one  day  and  no  more  than  14  days  after  the  first 
session  to  help  identify  learning  artifacts. 

Upon completing a session, subjects (i) indicated their
most preferred control method and (ii) filled out the subjective
evaluation questionnaires for each assistance paradigm,
which queried for the user’s trust in, and perceived utility and
contribution of, the autonomy over a 7-point Likert scale.

Post-experiment  analysis  of  the  subjective  evaluations 
showed  that  7  of  the  14  participants  chose  a  different 
paradigm  as  most  preferred  in  their  second  session.  We  also 
observed noticeable (and sometimes significant) differences
in the performance, effort and preference of users under different
shared-control paradigms [3]. In this paper, we extend
our analysis to build a correlation model that maps objective
execution metrics to subjective evaluations. Our aims are (i)
to highlight the reasons for observed differences between
subject groups, sessions and assistance paradigms, and (ii) to
predict subjective evaluations online, which potentially could
be used to switch or modify autonomy paradigms on the fly.

Fig. 3: Sample experimental run, originally presented in [3].
Doorway traversal phases shown as differently colored lines,
and black dots are projected sensor data.

IV.  Correlation  Model  between  Objective  and 
Subjective  Metrics 

Our approach models the correlation between user preference
and execution metrics. This section formulates the
objective  task  execution  metrics  that  are  the  inputs  to  the 
correlation  model,  and  then  details  the  structure  of  the  model. 

A.  Execution  Metrics 

For  the  execution  metrics,  four  characteristics  of  robot 
operation  upon  which  shared-control  paradigms  frequently 
are  built  [7]  are  chosen:  effort,  safety,  performance  and 
utilization.  In  particular,  the  chosen  task-specific  metrics  are: 

• Task Completion Time: $T = t_f - t_0$ provides a measure
of task performance. Here $t_0$ and $t_f$ represent the
starting and ending time of each doorway traversal.

• Minimum Distance to Obstacles: $D = \frac{1}{N} \sum_{t=t_0}^{t_f} \|d_t\|$
provides a measure of user safety. Here $\|d_t\|$ is the
minimum distance between the wheelchair footprint and
obstacles and $N$ is the number of samples.

• Similarity of User and Executed Commands: $S = 1 - \frac{1}{N} \sum_{t=t_0}^{t_f} \|u^t - u_h^t\|$
provides insight into utilization of
autonomy by comparing the executed command $u^t$ with
the user input $u_h^t$.

• Mean Frequency of User Commands:
$M = \frac{1}{N} \sum_{t=t_0}^{t_f} \frac{\sum_{i=1}^{L} f_i P_i}{\sum_{i=1}^{L} P_i}$
provides insight into user
effort. (The mean frequency of surface EMG signals
has been shown to indicate muscle fatigue [15].) Here
$f_i$ is the frequency value of the user signal’s power
spectrum $P_i$ at frequency bin $i$, and $L$ is the length of
the frequency bin.


Note  that  command  frequency  and  similarity  are  metrics 
specific  to  shared-control  systems.  By  contrast,  execution 
time  and  distance  to  obstacles  are  common  performance- 
related  metrics  used  to  evaluate  autonomous  robotic  systems. 
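
To make the four metrics concrete, the sketch below computes them for one doorway traversal using only the Python standard library. It is an illustration under assumptions, not the authors' MATLAB processing: function names are hypothetical, commands are (v, omega) pairs, and the mean frequency is taken from a single discrete Fourier transform over the whole traversal (including the zero-frequency bin) rather than from any windowing scheme.

```python
import cmath
import math
from typing import List, Sequence, Tuple

Command = Tuple[float, float]  # (translational speed v, rotational speed omega)


def completion_time(t0: float, tf: float) -> float:
    """T: duration of one doorway traversal."""
    return tf - t0


def mean_min_distance(distances: Sequence[float]) -> float:
    """D: per-sample minimum footprint-to-obstacle distance, averaged."""
    return sum(distances) / len(distances)


def similarity(u_exec: List[Command], u_user: List[Command]) -> float:
    """S: one minus the mean Euclidean gap between executed and user commands."""
    gaps = [math.dist(u, h) for u, h in zip(u_exec, u_user)]
    return 1.0 - sum(gaps) / len(gaps)


def mean_frequency(signal: Sequence[float], sample_rate: float) -> float:
    """M: power-weighted mean frequency of a one-dimensional user signal."""
    n = len(signal)
    weighted, total = 0.0, 0.0
    for k in range(n // 2 + 1):  # non-negative frequency bins only
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal)
        )
        power = abs(coeff) ** 2
        weighted += (k * sample_rate / n) * power
        total += power
    return weighted / total if total > 0.0 else 0.0
```

A quick sanity check is a pure tone: for a 2 Hz sine sampled at 25 Hz over exactly two seconds, nearly all spectral power falls in the 2 Hz bin, so the mean frequency should come out very close to 2 Hz.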

B.  Correlation  Models 

For  the  design  of  the  correlation  model,  a  cascaded 
model  structure  is  chosen.  Specifically,  this  model  maps  (A) 
objective  task  execution  metrics  to  subjective  evaluations 
(scored  on  a  Likert  scale)  and  (B)  these  estimated  subjective 
evaluations  to  the  most  preferred  assistance  paradigm. 

Eight pairs of A and B models are built, one for each combination
of subject group (2) and shared-control paradigm
(4). The end-to-end operation thus is to predict a distribution
over shared-control paradigms from observed objective
metrics, conditioned on the control paradigm (and subject
affiliation) in use when the metrics are gathered. The dataset
for each model contains only 14 samples (7 subjects in each
of 2 sessions). Training and evaluation therefore is performed
using 7-fold cross-validation, where each fold is a split of
70% training, 15% testing and 15% validation data (used
during training to prevent over-fitting).
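
With 14 samples, a 70/15/15 split works out to roughly 10 training, 2 validation and 2 test samples per fold. The sketch below shows one way such folds could be generated; the rotation scheme (test on fold k, validate on fold k+1) and the function name `seven_fold_splits` are assumptions for illustration, since the paper does not specify how the validation subset was drawn.

```python
from typing import Iterator, List, Tuple


def seven_fold_splits(
    n_samples: int = 14, n_folds: int = 7
) -> Iterator[Tuple[List[int], List[int], List[int]]]:
    """Yield (train, validation, test) index lists for each fold.

    Assumption: fold k is held out for testing, fold k+1 (cyclically)
    for validation, and the remaining folds are used for training.
    """
    indices = list(range(n_samples))
    folds = [indices[k::n_folds] for k in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        validation = folds[(k + 1) % n_folds]
        held_out = set(test) | set(validation)
        train = [i for i in indices if i not in held_out]
        yield train, validation, test
```

If the samples are ordered so that indices 0-6 are session 1 and 7-13 are session 2 (again an assumption), each fold holds both sessions of one subject, which keeps a subject's data out of training when that subject is tested.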

We  evaluate  a  variety  of  off-the-shelf  machine  learning 
tools  to  build  each  model:  including  linear  regression  with 
Lasso  regularization,  decision  and  regression  trees,  support 
vector  regression  and  neural  networks  (all  implementations 
are  performed  in  MATLAB  using  the  Statistics  and  Machine 
Learning  and  Neural  Network  Toolboxes).  For  all  models,  the 
best  regression  performance  is  achieved  with  a  multi-layer 
feed-forward  neural  network. 

1) Model A: Prediction of Subjective Evaluations: The
first model (A) of our cascade predicts the subjective scores.
Its input vector $z \in \mathbb{R}^4$ contains the 4 execution metrics
(Sec. IV-A) computed over 2 trials. Its output vector
$\mathbf{y} \in \mathbb{R}^3$ contains the 3 subjective evaluations (trust, contribution,
utility) rated on a Likert scale. Each prediction $y \in \mathbf{y}$ is
continuous-valued and lies in $[1, 7]$.

A  grid  search  is  performed  to  optimize  the  number  of 
hidden  layers  and  units  that  minimizes  the  total  mean  squared 
error.  For  all  models,  2  hidden  layers  using  radial  basis 
activation  functions  with  a  single  linear  output  layer  are 
found  to  perform  the  best.  The  only  difference  between 
models is the number of neurons in each hidden layer. (SCI
models: P1 and P2 5-5, P3 and P4 20-5; uninjured models:
P1 5-5, P2 40-5, P3 5-5, P4 20-5.)

2) Model B: Prediction of Most Preferred Paradigm: The
second model (B) of our cascade predicts a distribution of
preference over shared-control paradigms. Its input vector
$z \in \mathbb{R}^3$ contains the 3 subjective evaluations (trust, contribution,
utility) predicted by the first model (A). Its output
vector $\mathbf{y} \in \mathbb{R}^5$ is a probability distribution of preference over
the 4 shared-control paradigms plus a paradigm without any
autonomy (direct teleoperation).

Fig. 4: Subjective evaluation scores, measured (solid bars) and predicted (dashed bars), averaged across subject groups and
cross-validation folds.

Models are trained using the same optimization and cross-
validation routine described above. A neural network with 2
hidden layers again performs best, this time using hyperbolic
tangent sigmoid activation functions and a softmax activation
function in the final layer. Again, the only difference
between models is the number of neurons in each hidden
layer (all SCI models: 10-5, all uninjured models: 5-5).
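
The shape of the cascade can be sketched as two small forward passes. The code below is only a structural illustration with random placeholder weights: it assumes a Gaussian radial basis activation of the form exp(-n^2) for Model A, clips Model A outputs to the Likert range [1, 7], and uses the 5-5 and 10-5 layer sizes reported above for one SCI configuration. Training, and the exact MATLAB layer definitions, are outside its scope.

```python
import math
import random
from typing import List, Sequence, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weight rows, biases)


def dense(x: Sequence[float], weights: List[List[float]], biases: List[float]) -> List[float]:
    """Affine layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def radbas(v: Sequence[float]) -> List[float]:
    """Gaussian radial basis activation (assumed form exp(-n^2))."""
    return [math.exp(-a * a) for a in v]


def softmax(v: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(v)
    exps = [math.exp(a - peak) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def init_layers(sizes: Sequence[int], rng: random.Random) -> List[Layer]:
    """Random placeholder weights; a trained model would replace these."""
    return [
        ([[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
         [rng.uniform(-1, 1) for _ in range(n_out)])
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def model_a(metrics: Sequence[float], layers: List[Layer]) -> List[float]:
    """4 execution metrics -> 3 subjective scores (trust, contribution, utility)."""
    h = radbas(dense(metrics, *layers[0]))
    h = radbas(dense(h, *layers[1]))
    scores = dense(h, *layers[2])  # linear output layer
    return [min(7.0, max(1.0, s)) for s in scores]  # assumption: clip to [1, 7]


def model_b(scores: Sequence[float], layers: List[Layer]) -> List[float]:
    """3 subjective scores -> preference distribution over 5 options."""
    h = [math.tanh(a) for a in dense(scores, *layers[0])]
    h = [math.tanh(a) for a in dense(h, *layers[1])]
    return softmax(dense(h, *layers[2]))
```

Chaining the two, `model_b(model_a(metrics, layers_a), layers_b)`, yields a five-element distribution whose largest entry is the predicted most preferred option.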

V.  Experimental  Results 

We  evaluate  the  performance  of  the  cascaded  correlation 
model,  and  evaluate  the  relative  contribution  of  each  task 
metric.  Results  suggest  that  task  execution  metrics  can 
infer  user  preference,  even  under  different  shared-control 
paradigms, subject groups and sessions. Furthermore, a relative
contribution analysis is performed to examine the effect
of  different  task  execution  metrics  on  user  acceptance. 

A.  Prediction  of  Subjective  Evaluations 

The  performance  of  the  neural  network  regression  is  given 
in  Figure  4.  The  normalized  mean  square  error  averaged  over 
all  control  paradigms  and  cross-validation  folds  on  the  test 
dataset is 2.0±1.2% for the SCI models and 2.1±1.1% for
the uninjured models. The proposed approach thus successfully
estimates subjective evaluations based on the chosen
objective  execution  metrics. 

B.  Relative  Contribution  of  Each  Objective  Metric 

We  further  analyze  the  contribution  of  each  performance 
metric  using  sensitivity  analysis.  Input  perturbation  is  shown 
to  effectively  capture  the  relative  contribution  of  each  input 
in  comparison  to  other  sensitivity  analysis  methods  [16]. 
This  method  compares  the  rate  of  change  in  the  mean 
squared  error  of  the  measured  and  estimated  values  when 
a perturbation is applied to each input as white noise with
magnitude between 5% and 50%.

For  this  analysis,  for  each  of  the  eight  subject-paradigm 
datasets, we select from the 7-fold cross-validation the
best  performing  model.  The  entire  dataset  then  is  run,  with 
perturbations,  through  this  model. 
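
The input perturbation procedure can be sketched as follows: perturb one input at a time with noise scaled to a given percentage, re-run the fixed model over the whole dataset, and record the resulting mean squared error. This is a minimal sketch with assumptions: the noise is zero-mean Gaussian scaled relative to each input value, `model` is any callable returning a list of predictions, and the function names are hypothetical.

```python
import random
from typing import Callable, Dict, List, Sequence


def mean_squared_error(
    model: Callable[[Sequence[float]], List[float]],
    inputs: Sequence[Sequence[float]],
    targets: Sequence[Sequence[float]],
) -> float:
    """Mean squared error over all samples and output dimensions."""
    total, count = 0.0, 0
    for x, y in zip(inputs, targets):
        prediction = model(x)
        total += sum((p - t) ** 2 for p, t in zip(prediction, y))
        count += len(y)
    return total / count


def perturbation_sensitivity(
    model: Callable[[Sequence[float]], List[float]],
    inputs: Sequence[Sequence[float]],
    targets: Sequence[Sequence[float]],
    levels: Sequence[float] = (0.05, 0.10, 0.20, 0.30, 0.40, 0.50),
    seed: int = 0,
) -> Dict[int, List[float]]:
    """Map each input index to its error curve over perturbation levels."""
    rng = random.Random(seed)
    curves: Dict[int, List[float]] = {}
    for j in range(len(inputs[0])):
        curve = []
        for level in levels:
            # Assumption: multiplicative Gaussian noise on input j only.
            noisy = [
                [v * (1.0 + level * rng.gauss(0.0, 1.0)) if k == j else v
                 for k, v in enumerate(x)]
                for x in inputs
            ]
            curve.append(mean_squared_error(model, noisy, targets))
        curves[j] = curve
    return curves
```

An input whose error curve rises steeply with the perturbation level contributes more to the prediction than one whose curve stays flat; an input the model ignores produces no change at all.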

The  corresponding  mean  squared  error  percentages  are 
plotted  as  a  function  of  perturbation  amount  in  Figure  5.  The 
sensitivity analysis results demonstrate that across shared-
control paradigms, there is a high variability in the dependence
of each model on the various metrics. That is, which
metrics have the greatest impact on prediction performance
changes depending on which shared-control paradigm is in
use, which helps to explain why we see better performance
when we partition the dataset and build paradigm-specific
models. Moreover, we also observe differences between subject
groups, meaning that subjects’ evaluations of a control
paradigm are influenced by different metrics.

C.  Prediction  of  Most  Preferred  Paradigm 

The predicted subjective evaluation scores provide a quantitative
evaluation of each shared-control paradigm. From
these  values,  estimating  the  most  preferred  paradigm  could 
enable  adaptive  assistance  based  on  the  user’s  preference. 

The prediction of most preferred paradigm is more complex
than simply selecting the highest-scoring paradigm. In 9
out  of  28  cases,  the  most  preferred  shared-control  paradigm 
does  not  have  the  highest  evaluation  score.  Moreover,  in  3 
of  these  9  cases,  no  assistance  was  chosen  as  most  preferred. 

Our method computes a preference distribution over control
paradigms. From this, we compute the most preferred
paradigm  by  taking  the  maximum  of  this  distribution.1 
A  confusion  matrix  for  the  prediction  of  most  preferred 
paradigm  is  given  in  Figure  6.  The  test  data  accuracy 
averaged  over  all  models  is  74.1%  (while  chance  is  20%). 
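
The selection and scoring rule described here, including the tie handling from the footnote, can be sketched in a few lines. The option ordering in `OPTIONS` and the function names are hypothetical; ground truth is represented as a set of indices so that a subject who named two paradigms as equally preferred is matched by either prediction.

```python
from typing import List, Sequence, Set

# Hypothetical ordering: no assistance plus the four shared-control paradigms.
OPTIONS = ("none", "P1", "P2", "P3", "P4")


def most_preferred(distribution: Sequence[float]) -> int:
    """Index of the maximum of the predicted preference distribution."""
    return max(range(len(distribution)), key=lambda i: distribution[i])


def accuracy(predictions: List[Sequence[float]], targets: List[Set[int]]) -> float:
    """Fraction of samples whose predicted option is among the preferred ones."""
    hits = sum(1 for p, t in zip(predictions, targets) if most_preferred(p) in t)
    return hits / len(targets)


# With five options, a uniformly random guess is correct with probability 1/5.
CHANCE_LEVEL = 1.0 / len(OPTIONS)
```

For example, a target of `{2, 3}` (P2 and P3 tied) is counted as correct when the distribution peaks at either index.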

Unsurprisingly,  performance  declines  with  diminishing 
instances of a given class. Performance overall is promising,
though still leaves room for improvement. The gap
in  prediction  performance  might  be  addressed  with  a  more 
sophisticated  model  or  a  larger  dataset,  or  simply  might  be 
a  function  of  metrics  unobservable  to  our  current  robotic 
system  (e.g.  user  fatigue). 

VI.  Conclusion 

For  the  large-scale  adoption  of  smart  wheelchairs,  control 
sharing  needs  to  accommodate  each  user’s  unique  motor 
ability  and  personal  preference.  It  is  desirable  to  be  able  to 
predict  this  preference  without  needing  to  constantly  query 
the  user.  In  this  work,  we  have  shown  that  it  is  possible 
to  estimate  user  preference  based  on  objective  execution 
metrics  of  the  shared-control  task  chosen  from  the  literature: 
effort,  safety,  performance  and  utilization.  Results  from  our 
previous  multi-session  experiment  revealed  a  subset  of  users 
that  change  their  most  preferred  shared-control  paradigm 
between  sessions.  Our  proposed  model  successfully  captures 
the  relation  between  execution  metrics  and  user  opinion  in 
this experimental data, including these preference changes.

1  For  5  of  28  datapoints,  subjects  indicated  two  paradigms  equally  as  most 
preferred.  For  these  datapoints,  we  consider  the  prediction  of  either  as  most 
preferred  to  be  a  true  positive. 
Fig. 6: Confusion matrix of predicted preference (predicted
class)  versus  ground  truth  (target  class),  summed  over  all  8 
models  and  their  datasets  (112  samples).  For  each  column,  a 
cell  counts  the  number  of  predictions  of  that  target  class  as  a 
fraction  of  (a)  the  number  of  instances  of  the  class  (diagonal 
elements,  true  positives)  and  (b)  the  number  of  instances  of 
all  other  classes  (off-diagonal  elements,  false  positives). 


Fig. 5: Relative contribution of each execution metric to
the prediction of subjective evaluations, for SCI (top) and
uninjured  (bottom)  subject  groups.  Different  metrics  shown 
as  colors,  different  control  paradigms  as  line  continuities. 

The  contribution  of  our  work  is  to  model  the  connection 
between  task  execution  metrics  and  user  preference  in  a 
human-robot system, while highlighting the relative contribution
of each of these objective metrics in the prediction of
most  preferred  paradigm. 

Our future work includes a longitudinal study to investigate
the adaptation of users to the autonomy and how this
affects  their  preference.  We  expect  that,  with  experience, 
dissimilarity  between  the  user  and  executed  signals  will 
saturate  to  a  steady  state,  and  that  the  resulting  change  in 
driving  characteristics  would  reshape  the  correlation  model. 